Free Energy Landscape of Protein-like Chains with Discontinuous Potentials 
In this article the configurational space of two simple protein models consisting of polymers composed
of a periodic sequence of four different kinds of monomers is studied as a function of temperature.
In the protein models, hydrogen bond interactions, electrostatic repulsion, and covalent bond
vibrations are modeled by discontinuous step, shoulder and square-well potentials, respectively. The 
protein-like chains exhibit a secondary alpha helix structure in their folded states at low tempera- 
tures, and allow a natural definition of a configuration by considering which beads are bonded. Free 
energies and entropies of configurations are computed using the parallel tempering method in com- 
bination with hybrid Monte Carlo sampling of the canonical ensemble of the discontinuous potential 
system. The probability of observing the most common configuration is used to analyze the nature 
of the free energy landscape, and it is found that the model with the least number of possible bonds 
exhibits a funnel-like free energy landscape at low enough temperature for chains with fewer than 
30 beads. For longer proteins, the landscape consists of several minima, where the configuration 
with the lowest free energy changes significantly by lowering the temperature and the probability of 
observing the most common configuration never approaches one due to the degeneracy of the lowest 
accessible potential energy. 



I. INTRODUCTION 



Statistical mechanical modeling has helped signifi- 
cantly in addressing the question of why protein fold- 
ing occurs so rapidly in spite of the astronomically large 
number of possible configurations available to a protein. 
It has been suggested that folding occurs on funnel- 
shaped energy landscapes rather than involving a single 
microscopic pathway through a complicated landscape.
Onuchic, Dill, Wolynes and co-workers proposed that a 
"folding funnel" is the special characteristic of foldable 
proteins that directs the folding protein into the native 
state without the need for a definite pathway. According
to this picture, topological features of the free
energy landscape, defined in a coarse-grained sense by 
averaging over conformations of the protein with similar 
characteristics, assist the folding process by channeling or 
funneling the evolution of configurations. The folding of 
a protein is viewed as a process in which the protein glides 
down in the funnel-shaped free-energy landscape as the 
temperature drops or as time progresses along a multitude
of different paths towards its native structure.
According to this viewpoint, structures with low free en- 
ergies are situated within a basin of a broad energy valley 
and a protein in a configuration associated with one of 
the valleys can move quickly in the funnel to the lowest 
free energy state. 

Of course the true free energy landscape is never a sim- 
ple funnel, and the configurational space of a protein is 
a highly multi-dimensional space. Even for small proteins,
its dimensionality ranks in the several hundreds.
Within this high dimensional space, the free energy land- 
scape can feature many local minima separated by ener- 
getic and entropic barriers. 



Although the free energy landscape of proteins is of- 
ten considered to be a key component in understanding 
the mechanisms of protein folding, the characterization 
of the structure of the free energy landscape is nebu- 
lous due to the difficulty of identifying the relationship 
between different conformations of proteins and deter- 
mining whether particular configurations are within the 
same configurational basin. The difficulty of identifying 
conformations of proteins is compounded by the com- 
putational challenge of achieving converged sampling of 
available configurations for realistic protein models. 

In this paper, studies of the energy landscape of a 
protein-like chain in the absence of any fluid are presented.
Such a study is not feasible at present for realistic
models of proteins, so simplified models are used
to capture the basic behavior of proteins. Discontinuous 
potentials are used for the interaction potentials, where 
attraction and repulsion are defined as step and shoulder
potentials, respectively. The Hybrid Monte Carlo
(HMC) method is applied for the sampling of the
energy landscape of a protein-like chain, in which the
Monte Carlo sampling is done using parallel tempering
(PT) and the generation of trial configurations is carried
out by discontinuous molecular dynamics (DMD).
The PT method improves the convergence properties
of Monte Carlo sampling by decreasing the correlation
length of samples in the Markov chain of states.

It is shown that for two simple protein models, each 
consisting of a periodic sequence of four different kinds 
of bead, the folded state exhibits a secondary alpha helix 
structure. It is demonstrated that the relative configurational
entropies of the protein-like chains are independent
of temperature for the discontinuous potential models, 
which makes it possible to compute the relative configurational
entropies and the free energies of the configurations
very accurately. Relative configurational free energies
at different temperatures can be determined from
relative populations at those temperatures. The free en- 
ergy results can be interpreted in terms of the free energy 
landscape picture. Such understanding of the free energy 
landscape is the main objective of this work. 

In Sec. II the models and their parameters are described,
and it is shown that relative configurational entropies
are temperature independent in the models. A
simplified three state model is also presented to facilitate
the interpretation of the simulation results. In Sec. III
the results for the observed structures, configurational
entropy and free energy differences are presented, and the
shape of the free energy landscape is analyzed both for
short and long chains. Conclusions are given in Sec. IV.



II. MODELS OF THE PROTEIN-LIKE CHAIN 

In this article we consider a beads on a string model 
of a protein-like chain in which each bead represents an 
amino acid or residue. The chain consists of a repeated 
sequence of four different kinds of beads. While having 
four different types of beads is not enough to represent 
the twenty different types of amino acids, it preserves at 
least some of the differences between amino acids. The 
interactions between these beads are designed to mimic 
the interactions that lead to the formation of common 
motifs in protein structure, such as the alpha helix. Pre- 
vious studies suggest that chains containing only 6, 8 or 
12 monomers are too short to fold into compact states 
at low temperatures, while somewhat longer chains with 
25 monomers can capture folded helical states. Here,
chains of moderate lengths of 15 to 35 beads have been 
used to facilitate the exploration of the free energy land- 
scape. 

The models analyzed here allow for attractive interac- 
tions, intended to mimic hydrogen bonds between non- 
adjacent residues, between beads separated from each 
other by $4n$ beads, where $n \geq 1$, and with additional
restrictions on the possible hydrogen bonds to be speci- 
fied below. Several versions of the models of protein-like 
chains have been considered, but only the results for two 
of them are presented here. Models were selected based 
on the similarity of preferred structures in the model to 
those observed in real proteins. 

To make contact with real proteins, physical units are 
used in the definition of the model, although these should 
not be taken too literally. In particular, lengths will be 
expressed in Angstroms, energies in kJ/mol and masses 
in atomic mass units. 

FIG. 1: Model potentials: (a) the infinite square-well potential, (b) the attractive step potential, (c) the repulsive shoulder potential, and (d) the hard core repulsion.

The two models analyzed here differ in the hydrogen-bond
potentials, while other interactions are the same.
In total, four different potentials are used in these models.
The first kind of potential acts between the nearest
and the next nearest neighbors and restricts the distance
between the beads to specific ranges by applying an infinite
square-well potential similar to Bellemans' bonds
model. Fig. 1(a) shows the shape of this kind of potential.
To mimic a covalent bond between two consecutive
amino acids in the protein, the distance between
two neighboring beads is restricted to the range 3.84 Å
to 4.48 Å. This potential allows these distances to vibrate
around values close to the distance between stereocenters 
used in Ref. [l^ Bond angle vibrations are similarly rep- 
resented by defining infinite square-well potentials be- 
tween next-nearest neighbors in the chain. Restricting 
their distance to a range from 5.44 Å to 6.40 Å generates
a vibration angle between 75° and 112°. For simplic- 
ity, dihedral angles are not considered in our models, but 
as discussed later, some restrictions on hydrogen bonds 
are employed to create rigidity in the backbone of the 
protein-like chain similar to the rigidity that results from 
the dihedral angle interactions in more detailed poten- 
tials. 
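The quoted angle range can be checked with the law of cosines. The sketch below assumes the three consecutive beads form a triangle whose two sides lie in the nearest-neighbor range and whose third side lies in the next-nearest-neighbor range; the function name `bond_angle_deg` is illustrative and does not come from the original work.

```python
import math

def bond_angle_deg(b1, b2, d):
    """Angle (degrees) at the middle bead for bond lengths b1, b2
    and a distance d between the two outer beads (law of cosines)."""
    cos_theta = (b1**2 + b2**2 - d**2) / (2.0 * b1 * b2)
    return math.degrees(math.acos(cos_theta))

B_MIN, B_MAX = 3.84, 4.48   # nearest-neighbor distance range (Angstrom)
D_MIN, D_MAX = 5.44, 6.40   # next-nearest-neighbor distance range (Angstrom)

# The angle shrinks when the bonds are long and the outer beads are close,
# and grows when the bonds are short and the outer beads are far apart.
theta_min = bond_angle_deg(B_MAX, B_MAX, D_MIN)
theta_max = bond_angle_deg(B_MIN, B_MIN, D_MAX)
print(f"{theta_min:.1f} {theta_max:.1f}")
```

Under these assumptions the extreme angles come out at roughly 75° and 113°, in line with the range quoted above.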

Hydrogen bonds are modeled by an attractive square-well
potential, depicted in Fig. 1(b). In all models investigated
here, the attractive forces are defined between
beads $i$ and $i + 4n$ (with $n$ integer) to resemble the hydrogen
bonds in alpha helix structures. However, the two
models differ in the possibility of these attractive bonds 
and the values of i and n. 

In the first model, named model A, the attractive interactions
act between beads of the same type for half of the
bead types, such that bonds can be formed between two
beads both with an index of the form $i = 4k + 1$, or both
with an index of the form $i = 4k + 3$, where $k$ is an integer.

In the second model, model B, only the beads with 
index $i = 4k + 2$ can bond with each other, and $n$ cannot
be 2 or 3. This means that there is no attractive
bond between beads separated along the chain by eight
or twelve beads. Bonds between beads $i$ and $i + 8$ as
well as $i$ and $i + 12$ are disallowed to make the occurrence
of turns more difficult in the protein-like chain and
effectively make it more rigid. This restriction has a similar
function to torsional interaction potentials defined in
terms of dihedral angles along the backbone of the chain
in more detailed models, where they prevent a protein
from bending over easily. In Fig. 2 the possible attractive
bonds for the two models are presented for a chain
of length 25 in which subsequent beads were labeled A 
through Y. It will be shown that the two models have 
different thermodynamic characteristics and important 
qualitative differences in their free energy landscape due 
to the difference in the hydrogen bonding interactions. 
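The bonding rules of the two models can be enumerated directly. The following Python sketch assumes beads are numbered from 1 and labeled A through Y, as in the text; the function name `allowed_bonds` is hypothetical and introduced only for illustration.

```python
from itertools import combinations
import string

N_BEADS = 25
LABELS = string.ascii_uppercase[:N_BEADS]  # beads 1..25 labeled A..Y

def allowed_bonds(model, n_beads=N_BEADS):
    """List the letter pairs that may form an attractive bond in model 'A' or 'B'."""
    bonds = []
    for i, j in combinations(range(1, n_beads + 1), 2):
        separation = j - i
        if separation % 4 != 0:
            continue                      # bonds only join beads i and i + 4n
        n = separation // 4
        if model == "A":
            ok = i % 4 in (1, 3)          # both indices 4k+1, or both 4k+3
        elif model == "B":
            ok = i % 4 == 2 and n not in (2, 3)   # indices 4k+2, no i+8 or i+12
        else:
            raise ValueError("model must be 'A' or 'B'")
        if ok:
            bonds.append(LABELS[i - 1] + LABELS[j - 1])
    return bonds

print(len(allowed_bonds("A")), len(allowed_bonds("B")))
print(" ".join(allowed_bonds("B")))
```

For a 25-bead chain this enumeration yields 36 candidate bonds for model A and 8 for model B, so model B has far fewer ways to form bonds.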

For both models, the parameters for the attractive
square-well potential, $\sigma_1$ and $\sigma_2$, are chosen to be 4.64 Å
and 5.76 Å with a midpoint of 5.2 Å, which is close
to the translation of 5.4 Å along each turn of an alpha
helix. Compared to covalent bonds, these attractive interactions
act across longer distances. The depth of the
potential well $\epsilon$ is 20 kJ/mol and the mass of each bead is
set to $2 \times 10^{-25}$ kg, which is close to 120 atomic mass units.

To represent electrostatic interactions of the atoms, repulsive
interactions act between beads $1 + 4k$ and $4k'$,
where $k$ and $k'$ are integers and $k \neq k'$. The repulsive
interaction takes the form of a shoulder potential, shown
in Fig. 1(c). The range of the shoulder is set to be from
4.64 Å to 7.36 Å, while the height is $0.9\epsilon$. The effect of
changing the number of step repulsions in a few models 
was evaluated in terms of minimizing the free energy. It
turned out that changing the number of repulsions does
not have a large impact on the shape of the free energy
landscape around the native structure. Since the repulsion
between the beads increases the potential energy
while decreasing the configurational entropy, the most 
common structures at low temperatures do not have any 
repulsive interactions. Therefore, the two models differ 
only in their attractive potentials, while their repulsive 
interactions are the same. 

Finally, all other bead pairs for which no covalent 
bonds, hydrogen bonds or shoulder repulsive interactions 
are defined interact via a hard sphere repulsion to ac- 
count for excluded volume interactions at short distances, 
depicted in Fig. 1(d). The hard sphere diameter is set to
be 4.64 Å, which is slightly different from the value of
4.27 Å used by Zhou et al.
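The four pair potentials described in this section can be written as piecewise-constant functions. The sketch below is illustrative only: the function names are hypothetical, the treatment of the boundary points is arbitrary, and it is assumed (not stated explicitly in the text) that the attractive and shoulder potentials are infinite below their inner radius of 4.64 Å.

```python
import math

EPS = 20.0                    # well depth epsilon in kJ/mol
SIGMA1, SIGMA2 = 4.64, 5.76   # attractive well range (Angstrom)
SHOULDER_OUT = 7.36           # outer edge of the repulsive shoulder (Angstrom)
HARD_CORE = 4.64              # hard sphere diameter (Angstrom)

def bond_potential(r, r_min=3.84, r_max=4.48):
    """Infinite square well confining a bonded pair to [r_min, r_max]."""
    return 0.0 if r_min <= r <= r_max else math.inf

def attractive_potential(r):
    """Hydrogen-bond-like well of depth EPS between SIGMA1 and SIGMA2.
    The hard core below SIGMA1 is an assumption of this sketch."""
    if r < SIGMA1:
        return math.inf
    return -EPS if r < SIGMA2 else 0.0

def shoulder_potential(r):
    """Repulsive shoulder of height 0.9*EPS between 4.64 and 7.36 Angstrom.
    The hard core below the inner radius is an assumption of this sketch."""
    if r < HARD_CORE:
        return math.inf
    return 0.9 * EPS if r < SHOULDER_OUT else 0.0

def hard_sphere_potential(r):
    """Excluded-volume repulsion for all remaining bead pairs."""
    return math.inf if r < HARD_CORE else 0.0
```

Because every potential is piecewise constant, the potential energy of a chain is fixed entirely by which attractive and repulsive ranges are occupied.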

The reduced temperature is defined as $T^* = k_B T/\epsilon$,
where $\epsilon$ is the potential depth of the square-well attractive
interactions, and $\beta^*$ is the inverse of the reduced temperature,
$\beta^* = 1/T^*$. Given the value of $\epsilon = 20$ kJ/mol,
$T^* = 1.0$ corresponds to approximately 2400 K. This means that $\beta^* = 8$
($T^* = 1/8$) roughly corresponds to standard room temperature,
300 K.
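The conversion between the reduced inverse temperature and kelvin is a one-line calculation. The sketch below uses the molar gas constant in place of $k_B$ because $\epsilon$ is given per mole; the function name `kelvin_from_beta_star` is illustrative.

```python
R_GAS = 8.314462618e-3   # molar gas constant in kJ/(mol K)
EPS = 20.0               # well depth epsilon in kJ/mol

def kelvin_from_beta_star(beta_star, eps=EPS):
    """Temperature in kelvin for a reduced inverse temperature beta* = eps/(k_B T)."""
    return eps / (R_GAS * beta_star)

for beta_star in (1.0, 8.0):
    print(beta_star, round(kelvin_from_beta_star(beta_star), 1))
```

With $\epsilon = 20$ kJ/mol, $\beta^* = 1$ maps to about 2405 K and $\beta^* = 8$ to about 301 K.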



A. Definition of configurations 

One of the advantages of using discontinuous poten- 
tials is the ease of comparing configurations. The bonds 
are defined using the specific range of bead separations 
$r_{ij}$ in which the potential energy is
equal to a specific, non-zero value. Since only one attractive
bond can exist between each bead pair
in the current models, each configuration or structure 
can be represented by a matrix of interactions in which 
the entry at row i and column j is unity if i and j are 
bonded and zero otherwise. Because bonded interactions 
largely determine the form of the protein, this matrix 
can be used to identify the configuration of the protein- 
like chain. Thus, by comparing the matrices, identical 
structures can be easily found. 

However, for ease of presentation, a more readable al- 
phabetical notation for configurations is applied. Each 
bead is represented by a subsequent letter from the al- 
phabet and each bonded interaction is shown by a pair of 
letters. The two dimensional matrix can thus be repre- 
sented by a string of alphabetical pairs. Since most of the 
studied cases involve 25-bead chains, A to Y have been 
used to label different beads. For chains longer than 26 
beads, both capital and small letters can be used. 
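The two representations of a configuration can be converted into one another mechanically. The sketch below is a minimal illustration for chains of at most 26 beads; the helper names `bond_matrix` and `config_label` are hypothetical.

```python
import string

def bond_matrix(config, n_beads=25):
    """Build the symmetric 0/1 interaction matrix from a label such as 'BF BR BV'.
    An empty string stands for the configuration with no bonds."""
    index = {ch: k for k, ch in enumerate(string.ascii_uppercase[:n_beads])}
    matrix = [[0] * n_beads for _ in range(n_beads)]
    for pair in config.split():
        i, j = index[pair[0]], index[pair[1]]
        matrix[i][j] = matrix[j][i] = 1
    return matrix

def config_label(matrix):
    """Turn an interaction matrix back into a sorted string of letter pairs."""
    labels = string.ascii_uppercase
    n = len(matrix)
    return " ".join(labels[i] + labels[j]
                    for i in range(n) for j in range(i + 1, n) if matrix[i][j])

# Two listings of the same bonds in different order give the same matrix.
m1 = bond_matrix("RV BF NR JN FV FJ BV BR")
m2 = bond_matrix("BF BR BV FJ FV JN NR RV")
print(m1 == m2, config_label(m1))
```

Comparing matrices (or the canonical labels derived from them) makes the identification of identical structures independent of the order in which bonds are recorded.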



B. Temperature independence of relative 
configurational entropies 

The definition of configurations presented above was 
based on the presence of attractive bond interactions. 
Within the model, having a certain set of bonds (and no 
others) leads to a specific potential energy Uc for each 
configuration c. As shown below, this leads to a temper- 
ature independent relative configurational entropy. 

The configurational entropy of any particular config- 
uration c is the entropy of a sub-ensemble in which the 
phase points are restricted to those of configuration c. 
The discrete nature of the interactions allows configura- 
tional space to be partitioned into microstates by defining 
an index function for a configuration c that depends on 
the set of spatial coordinates of the chain R 

$$\chi_c(R) = \begin{cases} 1 & \text{if only bonds in } c \text{ are present,} \\ 0 & \text{otherwise.} \end{cases}$$

The partitioning of configurational space arises naturally 
by expanding the product in the identity

$$1 = \prod_{i=1}^{n_b} \left[ H(\sigma_2 - x_i) + H(x_i - \sigma_2) \right] = \sum_{k=1}^{n_s} \chi_k(R), \qquad (1)$$

where $n_b$ is the number of attractive bonds in the model,
$n_s = 2^{n_b}$ is the number of microstates, and $H(x)$ is the
Heaviside function

$$H(x) = \begin{cases} 1 & x > 0 \\ 0 & \text{otherwise.} \end{cases} \qquad (2)$$

In Eq. (1), $x_i$ is the distance between monomers in the
$i$th bond, and $\sigma_2$ is the critical distance at which an attractive
hydrogen bond is formed. For notational simplicity,
we order the index of configurations based on the
number of bonds, starting with the configuration with
no bonds, $\chi_1(R) = \prod_{i=1}^{n_b} H(x_i - \sigma_2)$, and ending with
the configuration with the maximum number of bonds,
$\chi_{n_s}(R) = \prod_{i=1}^{n_b} H(\sigma_2 - x_i)$.

FIG. 2: Possible attractive bonds of (a) model A, and (b) model B for a chain of 25 beads.
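The partition in Eq. (1) can be illustrated numerically for a small number of bonds. The sketch below expands the product into its $2^{n_b}$ terms and checks that exactly one indicator function equals one for a given set of bond distances; the names `chi` and `all_indicators` and the sample distances are illustrative assumptions.

```python
from itertools import product

SIGMA2 = 5.76   # outer range of the attractive bond (Angstrom)

def H(x):
    """Heaviside function: 1 for x > 0, else 0."""
    return 1 if x > 0 else 0

def chi(bonded_mask, distances, sigma2=SIGMA2):
    """Indicator of the configuration in which exactly the masked bonds are formed."""
    value = 1
    for bonded, x in zip(bonded_mask, distances):
        value *= H(sigma2 - x) if bonded else H(x - sigma2)
    return value

def all_indicators(distances):
    """Evaluate the indicator of every one of the 2**n_b bond patterns."""
    n_b = len(distances)
    return {mask: chi(mask, distances)
            for mask in product((False, True), repeat=n_b)}

indicators = all_indicators([5.0, 6.0, 5.5])   # three hypothetical bond distances
print(len(indicators), sum(indicators.values()))
```

For any set of distances away from the boundary $x_i = \sigma_2$, the indicators sum to one, which is the statement that each chain geometry belongs to exactly one configuration.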

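In the canonical ensemble, the probability $f_{\rm obs}(c, T)$ of
observing a configuration $c$ at temperature $T$ is

$$f_{\rm obs}(c, T) = e^{-\beta (F_c - F)}, \qquad (3)$$

where $F_c$ is the free energy of configuration $c$, and $F$ is
the full free energy of the system. By definition, one has

$$e^{-\beta F_c} = \frac{1}{h^{3N}} \int dR\, dP\, \chi_c(R)\, \exp\left[ -\beta \left( \sum_{i=1}^{N} \frac{|p_i|^2}{2m} + V(R) \right) \right], \qquad (4)$$

where $N$ is the number of beads, $m$ is their mass, $h$ is Planck's
constant, and $V$ is the potential energy function. The configurational
entropy is related to $F_c$ via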



$$F_c = E_c - T S_c, \qquad (5)$$

where $E_c$ is the average energy of configuration $c$ at temperature
$T$. Since its potential energy $V$ is always equal
to $U_c$ when it is finite and $\chi_c = 1$, one has

$$E_c = U_c + \frac{3}{2} N k_B T. \qquad (6)$$

Combining Eqs. (4)-(6), one finds

$$S_c = \frac{3}{2} N k_B \ln\left( \frac{2\pi m e}{\beta h^2} \right) + k_B \ln \int dR\, \chi_c(R), \qquad (7)$$



where the integral is restricted to sum over configurations 
that satisfy all geometric constraints due to the infinite 
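square-well and hard core repulsions. Thus the relative
entropy of two configurations $c_1$ and $c_2$ at a specific temperature
is

$$\Delta S_{c_1 c_2} = S_{c_1} - S_{c_2} = k_B \ln \frac{\int dR\, \chi_{c_1}(R)}{\int dR\, \chi_{c_2}(R)}, \qquad (8)$$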



which does not depend on temperature. 

From Eqs. (6) and (7) it can be concluded that the
free energy of a configuration is

$$F_c = U_c - \frac{3}{2} N k_B T \ln\left( \frac{2\pi m}{\beta h^2} \right) - k_B T \ln \int dR\, \chi_c(R), \qquad (9)$$

where the second term is the same for all the configura- 
tions at temperature T. 

Because relative configurational entropies do not de- 
pend on temperature, relative entropies can be deter- 
mined from a single run at a temperature T, using 



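$$\begin{aligned} \Delta S_{c_1 c_2} &= \frac{\Delta E_{c_1 c_2} - \Delta F_{c_1 c_2}}{T} && (10) \\ &= \frac{\Delta E_{c_1 c_2}}{T} + k_B \ln \frac{f_{\rm obs}(c_1, T)}{f_{\rm obs}(c_2, T)}, && (11) \end{aligned}$$

where $\Delta X_{c_1 c_2} = X_{c_1} - X_{c_2}$ for $X = S$, $E$, and $F$.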



Therefore, no approximation is necessary to calcu- 
late the relative configurational entropies in contrast to 
molecular dynamics (MD) studies utilizing smooth potentials
(see e.g. Ref. [20]).
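Eq. (11) can be applied directly to observed frequencies. The sketch below works in reduced units and uses the populations reported in Table II for the 8-bond and 7-bond structures of model B at three temperatures. It assumes that the potential energy of a structure is $-\epsilon$ times its number of attractive bonds (no repulsive contacts); the function name `delta_entropy` is illustrative.

```python
import math

def delta_entropy(beta_star, n_bonds_1, n_bonds_2, f1, f2):
    """Relative entropy (S_c1 - S_c2)/k_B from Eq. (11) in reduced units.
    Assumes U_c = -n_bonds * eps, i.e. no repulsive contacts are present."""
    delta_e = -(n_bonds_1 - n_bonds_2)       # (U_c1 - U_c2)/eps
    return beta_star * delta_e + math.log(f1 / f2)

# (beta*, f_obs of the 8-bond structure in %, f_obs of the 7-bond structure in %)
data = [(5.3, 46.4, 10.1), (6.0, 76.0, 6.8), (7.5, 94.1, 1.9)]
estimates = [delta_entropy(b, 8, 7, f1, f2) for b, f1, f2 in data]
print([round(s, 2) for s in estimates])
```

The three estimates agree to within a few tenths of $k_B$, which is comparable to the quoted statistical uncertainties and consistent with a temperature-independent relative entropy.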



C. Simulation Techniques 

The simulation results presented here were obtained 
utilizing a sampling method that uses a combination of 
dynamical updates based on DMD and PT exchange
moves. In this approach, a number of replicas are updated
simultaneously using molecular dynamics (appropriate
for the discontinuous potential systems) for a
fixed amount of time. At the start of a dynamical update,
the velocities of all beads in the chain are drawn from
the Maxwell-Boltzmann distribution for each replica at
the temperature appropriate for that replica. Since the
DMD is time-reversible, exactly conserves energy and
preserves phase space volume, the limit distribution of
the Markov chain of states for each replica is canonical
at the temperature of the Markov chain. Furthermore,
since the total energy is conserved exactly in the
dynamics, the updates provide a rejection-free means of
moving all degrees of freedom simultaneously. To enhance
the sampling efficiency, the dynamical updates are
combined with replica exchange updates. The replica exchange
moves are designed so that the states at each temperature
are canonically distributed. The process
of drawing velocities, DMD dynamics, and PT exchange 
moves is repeated until enough independent statistics on 
the frequency at which different configurations are seen 
is gathered. 
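The text does not spell out the exchange rule, so the following sketch uses the standard Metropolis criterion for replica exchange as an assumption. It further assumes that only potential energies enter the test, which is reasonable when velocities are redrawn before each dynamical update. The function names are hypothetical.

```python
import math
import random

def swap_probability(beta_a, beta_b, u_a, u_b):
    """Standard Metropolis acceptance probability for exchanging the
    configurations of two replicas at inverse temperatures beta_a and beta_b."""
    exponent = (beta_a - beta_b) * (u_a - u_b)
    return 1.0 if exponent >= 0 else math.exp(exponent)

def attempt_swap(betas, energies, configs, k, rng=random):
    """Try to exchange the configurations of neighboring replicas k and k+1."""
    p = swap_probability(betas[k], betas[k + 1], energies[k], energies[k + 1])
    if rng.random() < p:
        configs[k], configs[k + 1] = configs[k + 1], configs[k]
        energies[k], energies[k + 1] = energies[k + 1], energies[k]
        return True
    return False

# Hot replica (beta = 1) holds the lower energy: the swap is always accepted.
betas, energies, configs = [1.0, 2.0], [-3.0, -1.0], ["x", "y"]
print(attempt_swap(betas, energies, configs, 0), configs)
```

A swap that moves the lower-energy configuration to the colder replica is always accepted, while the reverse move is accepted with a Boltzmann-like probability, which keeps each temperature canonically distributed.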



D. Simplified three state model 

In the simulations, one can easily measure the frequency
of occurrence $f_{\rm obs}(c)$ of configurations $c$ at each
temperature in the PT replica set. The relative statistical
error of $f_{\rm obs}(c)$ is $O(1/\sqrt{f_{\rm obs}(c)})$, so the accuracy is
highest for the most frequently occurring (dominant) structure.
For that reason, below, we will often plot the observed frequency $f^*$
of the most common structure, i.e. $f^* = \max_c f_{\rm obs}(c)$,
as a function of the inverse temperature $\beta$. To facilitate
the interpretation of such a plot, it is helpful to consider
its form for the following, simplified three-state model.
The three states are configurations with energies $E_1$, $E_2$
and $E_3$, and entropies $S_1$, $S_2$, and $S_3$, respectively. As
in the actual model of the protein-like chain, the values
of entropies do not depend on temperature. The second
state will furthermore be taken to be $n_2$-fold degenerate
(this is thus really an $n_2 + 2$ state model).

FIG. 3: Variation of the probabilities of the most common
structure versus the inverse temperature for the three state
model with parameters $E_2 = -0.65$, $S_2 = 4$, $S_3 = 5$, and
three different values for $n_2$.

For each state in this model, the observational frequency
is

$$f_{\rm obs}(c, \beta) = \frac{e^{-\beta E_c + S_c/k_B}}{Z(\beta)}, \qquad (12)$$

where $c$ ranges from 1 to 3 and

$$Z(\beta) = e^{-\beta E_1 + S_1/k_B} + n_2\, e^{-\beta E_2 + S_2/k_B} + e^{-\beta E_3 + S_3/k_B}. \qquad (13)$$

We will assume that $E_1 < E_2 < E_3$ and $S_1 < S_2 < S_3$,
such that configuration 1 models the native state with
lowest energy and lowest entropy, configuration 3 models
the unfolded state with high energy and high entropy,
while configuration(s) 2 can be interpreted as intermediate.
Because only relative energies and entropies affect
$f^*$, we can set $E_3$ and $S_1$ to zero. Furthermore, one
can fix the temperature scale by setting $E_1$ to $-1$. That
leaves just four parameters in the model: $n_2$, $E_2$, $S_2$ and
$S_3$ (subject to the constraint that $S_2 < S_3$).

Figure 3 shows three examples of the behavior of $f^*$ for
this model, corresponding to the following choices of the
parameters: $E_2 = -0.65$, $S_2 = 4$, $S_3 = 5$, and $n_2 = 1$,
2, and 3, respectively. One sees a 'bouncing' signal as
subsequent states become dominant when temperature
is varied. There are cusp-shaped minima where the identity
of the dominant state changes. At that point, several
configurations are equally likely. If one neglects the
other, non-dominant, configurations at that point, then
the value of $f^*$ at a cusp should be one over the number
of competing structures, and this is borne out by the
plots shown in Fig. 3, which show cusp depths close to
1/2, 1/3 and 1/4, respectively. As $\beta$ increases (temperature
gets lower), the frequency of observing state 1 (the
'native' state) reaches almost 100%.

One can expect similar results for the protein-like chain 
model used in the simulations. The main difference with 
the three-state model is the presence of many more configurations.
Some of these extra states will be irrelevant
(have negligible $f_{\rm obs}$) because of their low entropy, but
one could expect to see extra bounces in the plots for the
real model from some additional relevant states.
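The behavior of the three-state model is easy to reproduce numerically. The sketch below evaluates $f^*$ from Eqs. (12) and (13) with $k_B = 1$, $E_1 = -1$, $E_3 = 0$, $S_1 = 0$ and the parameter values quoted for Fig. 3. The function name `f_star` is illustrative, and the log-weights are shifted by their maximum only to avoid numerical overflow.

```python
import math

def f_star(beta, n2, E=(-1.0, -0.65, 0.0), S=(0.0, 4.0, 5.0)):
    """Probability of the most common single configuration in the three-state
    model, where state 2 is n2-fold degenerate (k_B = 1)."""
    log_w = [-beta * e + s for e, s in zip(E, S)]
    shift = max(log_w)                       # stabilizes the exponentials
    w = [math.exp(x - shift) for x in log_w]
    Z = w[0] + n2 * w[1] + w[2]
    return max(w) / Z

# Inverse temperature at which states 1 and 2 carry equal weight:
# beta = 0.65*beta + 4, so beta = 4/0.35.
beta_cusp = 4.0 / 0.35
for n2 in (1, 2, 3):
    print(n2, round(f_star(beta_cusp, n2), 3), round(f_star(30.0, n2), 3))
```

At the crossover between states 1 and 2 the value of $f^*$ is close to $1/(n_2 + 1)$, and at large $\beta$ it approaches one, in line with the cusp depths and low-temperature limit described above.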



III. RESULTS 

A. Free energy landscape 

To characterize the (free) energy landscape at a specific
temperature, the most common structures are identified
and their relative free energies computed at that temperature.
Two structures are close in the landscape if they
have similar configurations, which means that they have
a large number of bonds in common. For model A, the
dominant structures are shown in Table I, while those
for model B are given in Table II, both for a chain length
of 25. The dominant structures at low temperatures are
designed to be helical in nature, with long chains allowing
for a primitive tertiary structure in which the helix
folds back on itself (see Fig. 4). The most common
structures at any temperature are those with the lowest
Helmholtz free energy at that temperature. Therefore,
at low enough temperatures, when the effect of entropy
is small, the most common structure is the one with the
lowest possible potential energy, which will only have attractive
bonds and no repulsive bonds. For this reason, unless
otherwise specified, here the term "bond" refers only to
an attractive bond (or hydrogen bond) and not repulsive
or covalent bonds. Using their interaction matrices,
it is relatively easy to count the number of occurrences
of the different structures and to find the most common
structures.

$\beta^*$   Most common structure                                            $f_{\rm obs}$ (%)
1.5    No bond                                                          14.2 ± 0.6
14.0   AE AI AY CG CK CS CW EI GK GO GS IY KO KS KW MQ MU OS QU SW      9.7 ± 0.6
24.0   AE AI AY CG CK CS CW EI GK GO GS IY KO KS KW MQ MU OS QU SW      10.6 ± 0.6
38.4   AQ AU AY CG CO CS CW EI EM GK GO GS GW IM KO KS OS QU SW UY      8.5 ± 0.6
57.5   AQ AU AY CG CO CS CW EI EM GK GO GS GW IM KO KS OS QU SW UY      7.7 ± 0.6
72.5   AE AI AM CG CK CO CS EI GK GS GW KO KS KW OS OW QU QY SW UY      8.1 ± 0.6
87.5   AE AI AM AQ AU AY CG CK EI EY GK IM IQ IY MQ MU OS QU QY SW UY   8.2 ± 0.6

$\beta^*$   Second most common structure                                     $f_{\rm obs}$ (%)
1.5    SW                                                               2.1 ± 0.2
14.0   AE AI AY CG CK CO CS CW EI GK GO IY KO KW MQ MU OS OW QU SW      8.7 ± 0.6
24.0   AE AI AY CG CK CO CS CW EI GK GO IY KO KW MQ MU OS OW QU SW      9.6 ± 0.6
38.4   AE AI AY CG CK CO CS CW EI GK GO IY KO KW MQ MU OS OW QU SW      6.6 ± 0.4
57.5   AE AI AM CG CK CO CS EI GK GS GW KO KS KW OS OW QU QY SW UY      4.5 ± 0.4
72.5   AQ AU AY CG CO CS CW EI EM GK GO GS GW IM KO KS OS QU SW UY      7.5 ± 0.4
87.5   AE AU AY CG CS CW EY GK GS GW IM IQ KO KS KW MQ OS OW SW UY      6.6 ± 0.6

TABLE I: Most common configurations of the model A 25-bead chain.

$\beta^*$   Most common structure        $f_{\rm obs}$ (%)
1.5    No bond                      22.4 ± 1.2
3.0    No bond                      6.7 ± 1.0
3.5    BF JN                        4.0 ± 0.6
4.2    BF FJ NR RV                  6.5 ± 0.8
4.5    BF BR BV FJ FV JN NR RV      7.5 ± 1.0
5.3    BF BR BV FJ FV JN NR RV      46.4 ± 1.6
6.0    BF BR BV FJ FV JN NR RV      76.0 ± 1.2
7.5    BF BR BV FJ FV JN NR RV      94.1 ± 0.8
13.5   BF BR BV FJ FV JN NR RV      99.9 ± 0.0

$\beta^*$   Second most common structure $f_{\rm obs}$ (%)
1.5    BF                           3.5 ± 0.6
3.0    BF                           5.6 ± 0.8
3.5    BF NR                        4.0 ± 0.6
4.2    BF FJ JN RV                  4.9 ± 0.8
4.5    BF FJ JN NR RV               6.4 ± 0.8
5.3    BF BR BV FJ JN NR RV         10.1 ± 0.8
6.0    BF BR BV FJ JN NR RV         6.8 ± 0.8
7.5    BF BR BV FJ JN NR RV         1.9 ± 0.6
13.5   N/A                          N/A

TABLE II: Most common configurations of the model B 25-bead chain.

According to the diagram in Fig. 2(b), for model B,
the maximum number of attractive bonds is 8 for the 25-bead
chain. As expected, the most common structure for
model B at low temperatures, $\beta^* > 4.5$, has 8 attractive
bonds (cf. Table II) and therefore has the lowest potential
energy for this model. According to Table I, the
lowest potential energy configuration in model A for the
25-bead chain has 21 attractive bonds. However, according
to Fig. 2(a), 36 possible attractive bonds are available
for the 25-bead chain in model A. This means that either
the configurations with lower energies that have more
than 21 attractive bonds are not geometrically accessible
(due to constraints in the model) or their configurational
entropies are too low to be observed at these temperatures.
It will be shown later (Sec. III D) that the former
scenario is the case. However, if the latter scenario were
true, the lower energy configurations would become dominant
at lower temperatures.

Within the framework of the model, a folding funnel is 
identified as a region of phase space points corresponding 
to a set of configurations from which the folded structure 
is easily and rapidly accessible as the temperature is low- 
ered. This means that the barriers between local min- 
ima located inside the funnel, such as those arising from 
entropic decreases associated with the formation of new 
bonds, are small. As the temperature is lowered, new 
minima appear in the funnel region of the energy landscape,
corresponding to nearby configurations that differ
in relatively few bonds from the previously favored struc- 
tures. If barriers between nearby states in the landscape 
are small, the system rapidly equilibrates to the presence 
of new minima and adopts a more folded structure. Al- 
though the specific pathway through which the system 
folds may involve a number of intermediate structures, 
the intermediate structures emerge smoothly with tem- 
perature and provide a channel to the folded structure. 

A quantitative measure of the folding funnel can be obtained
by examining how the dominance $f^*$ of the most
preferred structure changes with temperature, where $f^*$
is the probability of observing the most common configuration.
For real protein systems in which a single, folded
structure is thermodynamically stable, one expects that
$f^*$ is near unity for temperatures at which the protein is
folded. Furthermore, if the protein folds readily as the inverse
temperature $\beta$ increases, then we expect $df^*/d\beta$ to
be large and positive in the vicinity of the inverse folding
temperature.

As can be seen in Table I, by decreasing the temperature
for model A, some dominant structures are observed,
but by decreasing the temperature further, the ratios of
their populations to the total population start to decrease
and new structures become dominant. It can be
concluded that in this model, the shape of the landscape
changes significantly by varying the temperature, where
at high temperatures the landscape is riddled with many
local minima (many equally preferred structures) and one
very deep but wide minimum (no bonded structure), and
at low temperatures there are a few narrow deep minima.
For model A, either there are deep local minima inside
a funnel shaped valley or there are only a few deep local
minima beside each other. At the studied temperatures,
there is no structure with a very large population, which
confirms that there is no deep global minimum in the free
energy landscape. Since the most common structures at
each temperature differ from each other in a few bonds,
these deep minima are located close to each other in the
landscape but not inside a funnel in the sense that they
are not structures that are adjacent in the configurational
space and can only be converted into one another through
intermediates. The barriers involved in these conversions
are high enough to make this a slow process. For example,
as can be seen in Table I, the first two most common
structures at $\beta^* = 57.5$ differ in seven bonds. Hence
there are many barriers that must be overcome to go between
the two configurations because seven bonds must
be broken and seven new bonds must be formed. On
the other hand these two structures share thirteen bonds
(65% of their total bonds), which indicates that they are
similar and therefore their locations in the landscape are
still relatively close to each other.
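The bond counts quoted for the two leading structures at $\beta^* = 57.5$ follow from simple set operations on their labels in Table I. The sketch below is illustrative; the helper name `parse` is hypothetical.

```python
def parse(config):
    """Turn a configuration label into a set of bonds."""
    return frozenset(config.split())

# Most common and second most common model A structures at beta* = 57.5 (Table I)
first = parse("AQ AU AY CG CO CS CW EI EM GK GO GS GW IM KO KS OS QU SW UY")
second = parse("AE AI AM CG CK CO CS EI GK GS GW KO KS KW OS OW QU QY SW UY")

shared = first & second      # bonds present in both structures
to_break = first - second    # bonds that must be broken to go from first to second
to_form = second - first     # bonds that must be formed to go from first to second

print(len(shared), len(to_break), len(to_form))
print(f"{100 * len(shared) / len(first):.0f}% of bonds shared")
```

Each structure has 20 bonds, of which 13 are shared, so a conversion requires breaking 7 bonds and forming 7 others.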

Unlike the behavior observed in model A, a single dominant
structure is identified in model B by decreasing
the temperature, where the probability $f^*$ of the most
common structure attains a value of nearly one at low
temperatures (see Table II). For $\beta^* > 5.3$, the free energy
landscape consists of a single channel in which there
are several minima. The most common structures for
$\beta^* = 6$ are presented in Table III. None of the seven
most common structures has a repulsive bond. This is
not surprising, since the formation of a repulsive bond
both limits the number of accessible conformations and
is energetically unfavorable. The most common structure
for $\beta^* > 5.3$, BF BR BV FJ FV JN NR RV, is the
deepest point in the funnel, and the six other most common
structures listed in Table III differ only in one bond
from this structure. This means that there is a funnel-shaped
valley with a global minimum corresponding to a
folded helix and there are a few local minima of higher
free energy beside this deepest point of the landscape.
According to Table II, by lowering the temperature the
deepest point of the funnel becomes deeper while the
other minima become shallower, since the population of
the most common structure reaches a value higher than
99.9%. This implies that the funnel becomes smoother
and steeper as the temperature decreases, and the lowest
free energy configuration becomes more accessible.

FIG. 4: Folded helical structure for model B with 8 bonds.

The variation of the probability of the most common
structure f* for the two models as a function of temperature
is shown in Fig. 5 for chains of 25 beads. For both models,
there is a cusp-shaped minimum at which a low-energy
structure becomes dominant. The value of the probability is
very low at the minimum, indicating many competing
structures (see Sec. III D). For model A, the probability of
the most common structure at low temperatures fluctuates
around a small value of about 0.08, whereas for model B, the
probability of the most common structure nearly attains
unity. This demonstrates once more that the free energy
landscape for model A does not have a funnel-like shape.

It will become clear in Sec. III D that even for model B,
the funnel-like character of the free energy landscape does
not persist for chains longer than 29 beads due to geometric
frustration.

Rank   Most common structure       f_obs (%)
1      BF BR BV FJ FV JN NR RV     76.0 ± 1.2
2      BF BR BV FJ JN NR RV         6.8 ± 0.8
3      BF BV FJ FV JN NR RV         3.8 ± 0.6
4      BF BR BV FV JN NR RV         1.9 ± 0.4
5      BF BR BV FJ FV JN RV         1.3 ± 0.4
6      BF BR BV FJ FV NR RV         1.0 ± 0.3
7      BF BR BV FJ FV JN NR         1.0 ± 0.3

TABLE III: Most common configurations of the model B 25-bead
chain at β* = 6.

FIG. 5: Variation of the probabilities of the most common
structure f* versus the dimensionless inverse temperature β*
for chains with 25 beads.

B. Entropy and free energy calculation for the
25-bead chain in model B


As discussed in Sec. II B and as expressed in Eq. (11),
the relative configurational entropies and consequently 
the free energy difference of two configurations can be 
obtained from the ratio of their probabilities at a specific 
temperature. Since there are fewer possible structures in 
model B than in model A, the statistical uncertainty in 
the populations, and therefore also in the entropies and 
free energies, is smaller for model B. For this reason, and 
because it is already clear that model A does not have 
a funnel-like free energy landscape, subsequent analysis
will focus on the characteristics of model B. 
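
The relation invoked above can be written out explicitly. The following is a sketch that assumes, consistent with the free energy F_c = U_c − T S_c used later in this section, that the population f_c of configuration c is proportional to exp(S_c/k_B − β U_c); the labels a and b for the two configurations are notation introduced here for illustration only.

```latex
\frac{f_a}{f_b} = \exp\!\left[\frac{S_a - S_b}{k_B} - \beta\,(U_a - U_b)\right]
\quad\Longrightarrow\quad
\frac{S_a - S_b}{k_B} = \ln\frac{f_a}{f_b} + \beta\,(U_a - U_b),
\qquad
\beta\,(F_a - F_b) = -\ln\frac{f_a}{f_b}.
```

Because the potential energy of a configuration is fixed by its bonds in a discontinuous-potential model, a single measurement of the population ratio at a known temperature is, under this assumption, enough to obtain the entropy difference.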

The value of the entropy of a configuration should depend
largely on the number of bonds that it has, since the
formation of a bond restricts the distance between a specific
pair of beads. As can be seen in Table IV, although the
entropies of configurations with the same number of bonds
differ slightly, they are similar in magnitude. Typically, the
entropy decreases as the number of bonds increases, due to
the additional geometric constraints, with an entropy loss on
the order of a few k_B per bond (roughly 3 to 5 k_B per bond
for the entries of Table IV). Nonetheless, one sees that
configurations with the same energy of −6ε have somewhat
different populations and therefore different entropies.
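
As an illustration of the per-bond entropy loss, the entries of Table IV can be grouped by their number of attractive bonds. This is a minimal sketch: the values are transcribed from Table IV, the function names are hypothetical, and configuration 1 (which has a repulsive bond) and configuration 12 (whose entropy is not listed) are left out.

```python
from collections import defaultdict

# (number of attractive bonds, S_c/k_B) transcribed from Table IV.
# Configuration 1 (repulsive bond AD) and configuration 12 (entropy not
# listed in the table) are deliberately omitted.
TABLE_IV = [
    (0, 31.8), (1, 28.6), (2, 25.1), (2, 25.2), (3, 21.7),
    (4, 17.8), (4, 17.6), (5, 13.2), (7, 3.7), (7, 2.9),
]


def mean_entropy_by_bonds(rows):
    """Average S_c/k_B over configurations with the same number of bonds."""
    groups = defaultdict(list)
    for n_bonds, entropy in rows:
        groups[n_bonds].append(entropy)
    return {n: sum(v) / len(v) for n, v in sorted(groups.items())}


def entropy_loss_per_bond(means):
    """Entropy drop per added bond between consecutive listed bond counts."""
    counts = sorted(means)
    return {
        (lo, hi): (means[lo] - means[hi]) / (hi - lo)
        for lo, hi in zip(counts, counts[1:])
    }


means = mean_entropy_by_bonds(TABLE_IV)
losses = entropy_loss_per_bond(means)
for (lo, hi), loss in losses.items():
    print(f"{lo} -> {hi} bonds: {loss:.2f} k_B per bond")
```

For these entries the loss per bond grows somewhat as more bonds are formed, which is in line with the later bonds being geometrically more constraining.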

Index  Configuration              U_c (ε)   S_c/k_B
1      AD                          0.9      31.3 ± 0.8
2      No Bond                     0.00     31.8 ± 0.6
3      BF                         −1        28.6 ± 0.6
4      BF JN                      −2        25.1 ± 0.6
5      BF NR                      −2        25.2 ± 0.6
6      BF JN RV                   −3        21.7 ± 0.4
7      BF FJ NR RV                −4        17.8 ± 0.6
8      BF FJ JN RV                −4        17.6 ± 0.6
9      BF FJ JN NR RV             −5        13.2 ± 0.6
10     BF BR BV FJ JN NR RV       −7         3.7 ± 0.8
11     BF BV FJ FV JN NR RV       −7         2.9 ± 0.6
12     BF BR BV FJ FV JN NR RV    −8

TABLE IV: Potential energy in units of ε and relative entropy
of the most common structures of the model B 25-bead chain.

β*    Configuration              p_pred   f_obs   Δ (%)
1.5   No Bond                    0.206    0.165   25
1.5   BF                         0.068    0.059   15
1.5   RV                         0.059    0.065   9
5.0   BF BR BV FJ FV JN NR RV    0.949    0.941   0.8
5.0   BF BR BV FJ JN NR RV       0.020    0.019   5
5.0   BF BV FJ FV JN NR RV       0.010    0.012   17
6.0   BF BR BV FJ FV JN NR RV    0.988    0.980   0.8
6.0   BF BR BV FJ JN NR RV       0.005    0.005
9.0   BF BR BV FJ FV JN NR RV    0.999    0.999

TABLE V: Comparison of the predicted probability (p_pred) and
the simulation results for the frequency (f_obs), and their
relative difference (Δ), for the most common structures of the
model B 25-bead chain.

FIG. 6: Variation of β*ΔF versus the configuration index of
Table IV for model B, where ΔF is the Helmholtz free energy
difference with configuration 1 in units of ε.

FIG. 7: Temperature dependence of the free energy difference
of configurations 2 and 12 of the 25-bead model B chain, as
listed in Table IV, in units of ε.

Although in principle the entropy difference between any
two configurations can be calculated from the ratio of their
populations, often there is little overlap between the
population distributions of the most common structure at very
low temperatures and the most common structure at very high
temperatures (e.g., configurations 2 and 12 of Table IV).
Since the configurational entropy difference is independent of
temperature, this difficulty is easily overcome by using one
or two intermediate configurations whose population
distributions do have sufficient overlap over some range of
temperatures. Using the calculated entropies, one can compute
the relative Helmholtz free energy between any pair of
configurations at any temperature. This allows one to predict
the population of any structure at any temperature and to
predict the temperature at which the populations of two
specific configurations become equal. The potential energy and
relative entropy of some of the most common structures of
model B for the 25-bead chain are shown in Table IV.
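
The use of intermediate configurations can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used in the simulations: populations are taken to be proportional to exp(S_c/k_B − β* U_c/ε), the function names are hypothetical, and the population values in the example are invented purely to show the bookkeeping.

```python
import math


def entropy_difference(f_a, f_b, u_a, u_b, beta):
    """(S_a - S_b)/k_B from populations f_a, f_b observed at inverse temperature beta.

    Assumes f_c is proportional to exp(S_c/k_B - beta * U_c), with energies
    in units of epsilon and beta in units of 1/epsilon.
    """
    return math.log(f_a / f_b) + beta * (u_a - u_b)


def chained_entropy_difference(links):
    """Sum entropy differences along a chain a -> b -> c -> ...

    Each link is (f_first, f_second, u_first, u_second, beta). The links may
    be measured at different temperatures, because the configurational
    entropy difference does not depend on temperature.
    """
    return sum(entropy_difference(*link) for link in links)


# Hypothetical populations (illustrative only, not simulation data):
# a (0 bonds) and b (4 bonds) overlap at beta = 3; b and c (8 bonds) at beta = 5.
links = [
    (0.020, 0.010, 0.0, -4.0, 3.0),   # a versus b
    (0.004, 0.300, -4.0, -8.0, 5.0),  # b versus c
]
s_ac = chained_entropy_difference(links)
print(f"(S_a - S_c)/k_B = {s_ac:.2f}")
```

Each link only requires that both of its configurations are sampled with reasonable statistics at one common temperature, which is the overlap condition described above.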

The variation of β*ΔF versus the configuration index of
Table IV, which one could view as a simple way to plot the
free energy landscape, is shown in Fig. 6. The zero-point of
this free energy plot was (arbitrarily) chosen to be the free
energy of configuration 1 (AD), i.e., free energies were
computed as $\Delta F_{1c} = U_c - T S_c - (U_1 - T S_1)$,
where U_c and S_c were taken from Table IV. Since both the
entropy and the energy of the configurations decrease from
configuration 1 to 12, the behavior of β*ΔF is very different
for high and low temperatures. At high temperatures (β* < 3),
entropy effects dominate, and the configuration with the
largest entropy in Table IV (configuration 2) is the lowest
free energy structure for β* = 1.5 in Fig. 6. Note that the
free energy of the other structures increases with increasing
number of attractive bonds. In contrast, at lower
temperatures, energy effects dominate the free energy
landscape, and indeed, in Fig. 6 the configuration from
Table IV which has the lowest potential energy is seen to be
the lowest free energy structure for β* = 6 and β* = 12.
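
The construction of this free energy plot can be reproduced approximately from the tabulated values. The sketch below evaluates β*ΔF relative to configuration 1 from the energies and entropies of Table IV. One assumption is made explicitly: the entropy of configuration 12 is not listed in the table, and it is taken here as the zero of the relative entropy scale. The function name is hypothetical.

```python
# (index, U_c in units of epsilon, S_c/k_B) from Table IV.
# ASSUMPTION: the entropy of configuration 12 is not listed in the table;
# it is taken as the zero of the relative entropy scale here.
TABLE_IV = [
    (1, 0.9, 31.3), (2, 0.0, 31.8), (3, -1.0, 28.6), (4, -2.0, 25.1),
    (5, -2.0, 25.2), (6, -3.0, 21.7), (7, -4.0, 17.8), (8, -4.0, 17.6),
    (9, -5.0, 13.2), (10, -7.0, 3.7), (11, -7.0, 2.9), (12, -8.0, 0.0),
]


def beta_delta_f(beta, table=TABLE_IV, reference=1):
    """Return {index: beta* (F_c - F_ref)} with F_c = U_c - T S_c.

    In reduced units this is beta* (U_c - U_ref) - (S_c - S_ref)/k_B.
    """
    _, u_ref, s_ref = next(row for row in table if row[0] == reference)
    return {idx: beta * (u - u_ref) - (s - s_ref) for idx, u, s in table}


for beta in (1.5, 6.0, 12.0):
    landscape = beta_delta_f(beta)
    lowest = min(landscape, key=landscape.get)
    print(f"beta* = {beta}: lowest free energy is configuration {lowest}")
```

With these inputs the minimum sits at configuration 2 for β* = 1.5 and at configuration 12 for β* = 6 and β* = 12, in line with the behavior described for Fig. 6.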

Using the values in Table IV, one can determine that for
β* > 4.5, the folded helix configuration (configuration 12)
becomes dominant, since for all the temperatures in that
range, this configuration has the lowest free energy. The
simulation results for the population of each configuration
confirm this prediction. When configurations are ranked
according to their populations, configuration 12 ranks 30th,
13th and 5th for β* values of 4.05, 4.2 and 4.35,
respectively, while for β* > 4.5 it ranks first.
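
The threshold quoted above can be estimated from Table IV by finding, for each listed configuration, the β* at which its free energy equals that of the folded helix. This is a sketch restricted to the twelve tabulated configurations (the analysis in the text used more), and it assumes that the unlisted relative entropy of configuration 12 is zero; the names are hypothetical.

```python
# (index, U_c in units of epsilon, S_c/k_B) for configurations 1-11 of Table IV.
OTHERS = [
    (1, 0.9, 31.3), (2, 0.0, 31.8), (3, -1.0, 28.6), (4, -2.0, 25.1),
    (5, -2.0, 25.2), (6, -3.0, 21.7), (7, -4.0, 17.8), (8, -4.0, 17.6),
    (9, -5.0, 13.2), (10, -7.0, 3.7), (11, -7.0, 2.9),
]
# Folded helix (configuration 12). ASSUMPTION: its relative entropy, which is
# not listed in Table IV, is taken to be zero.
U_FOLDED, S_FOLDED = -8.0, 0.0


def crossing_beta(u, s, u_ref=U_FOLDED, s_ref=S_FOLDED):
    """beta* at which a configuration and the reference have equal free energy.

    Equating U - T S for both gives beta* = (S - S_ref) / (U - U_ref)
    in reduced units.
    """
    return (s - s_ref) / (u - u_ref)


crossings = {idx: crossing_beta(u, s) for idx, u, s in OTHERS}
last_idx = max(crossings, key=crossings.get)
threshold = crossings[last_idx]
print(f"configuration 12 is lowest in free energy for beta* > {threshold:.2f} "
      f"(last competitor: configuration {last_idx})")
```

Among the tabulated configurations the last competitor to be overtaken is one of the four-bond structures, slightly below β* = 4.5, which is compatible with the ranking data quoted in the text.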

The relative free energy of configurations 2 and 12 is
plotted against β* in Fig. 7. It can be seen that at β* ≈ 4
their free energies are equal, which implies that their
populations are the same. Indeed, simulation results indicate
that the populations of configuration 2 and configuration 12
at β* = 3.9 are 1.0% and 0.5%, respectively, and at β* = 4.05
are 0.6% and 1.2%, respectively, which confirms that their
populations should become equal in the range
3.9 < β* < 4.05.
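
The quoted populations can be turned into an estimate of the entropy difference between configurations 2 and 12 and of the β* at which their populations cross. The sketch assumes populations proportional to exp(S_c/k_B − β* U_c/ε) and uses the potential energies of Table IV; the function and variable names are hypothetical.

```python
import math

U_2, U_12 = 0.0, -8.0  # potential energies from Table IV, in units of epsilon


def entropy_difference(f_a, f_b, u_a, u_b, beta):
    """(S_a - S_b)/k_B from the population ratio f_a/f_b at inverse temperature beta."""
    return math.log(f_a / f_b) + beta * (u_a - u_b)


# Populations (in %) of configurations 2 and 12 quoted in the text,
# keyed by the value of beta* at which they were observed.
observations = {3.9: (1.0, 0.5), 4.05: (0.6, 1.2)}

estimates = [
    entropy_difference(f_2, f_12, U_2, U_12, beta)
    for beta, (f_2, f_12) in observations.items()
]
mean_ds = sum(estimates) / len(estimates)

# Equal populations require beta* (U_2 - U_12) = (S_2 - S_12)/k_B.
beta_equal = mean_ds / (U_2 - U_12)
print(f"(S_2 - S_12)/k_B estimates: {[round(e, 2) for e in estimates]}")
print(f"populations cross near beta* = {beta_equal:.3f}")
```

Both estimates of the entropy difference lie within the uncertainty of the entropy listed for configuration 2 in Table IV, and the resulting crossing point falls inside the interval 3.9 < β* < 4.05.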

It should be noted that, to simplify the computations,
structures that have a population of less than 0.5% are not
considered in our calculations. As a result, when calculating
the probabilities of the configurations with 25 beads, only
78 configurations were used. Although this introduces a
systematic error, the predicted probabilities are very close
to the ones observed in the simulation runs, as can be seen in
Table V. According to this table, the predicted values agree
better with the simulation results at lower temperatures. The
disagreement is due to the fact that some configurations with
very low populations have not been considered in the
probability calculations; since these configurations occur
more frequently at high temperatures, neglecting their
contribution leads to a larger error at high temperatures.
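
The temperature trend of the discrepancy can be read off Table V directly. The short sketch below recomputes the relative difference from the tabulated p_pred and f_obs, assuming (this definition is not spelled out in the text) that Δ is |p_pred − f_obs| / f_obs, and then averages it per β*. Only rows of Table V that list a value of Δ are included.

```python
from collections import defaultdict

# (beta*, p_pred, f_obs, tabulated Delta in %) for the rows of Table V
# that list a value of Delta.
ROWS = [
    (1.5, 0.206, 0.165, 25), (1.5, 0.068, 0.059, 15), (1.5, 0.059, 0.065, 9),
    (5.0, 0.949, 0.941, 0.8), (5.0, 0.020, 0.019, 5), (5.0, 0.010, 0.012, 17),
    (6.0, 0.988, 0.980, 0.8),
]


def relative_difference(p_pred, f_obs):
    """|p_pred - f_obs| / f_obs in percent (ASSUMED definition of Delta)."""
    return 100.0 * abs(p_pred - f_obs) / f_obs


by_beta = defaultdict(list)
for beta, p_pred, f_obs, _ in ROWS:
    by_beta[beta].append(relative_difference(p_pred, f_obs))

mean_delta = {beta: sum(v) / len(v) for beta, v in by_beta.items()}
for beta, delta in sorted(mean_delta.items()):
    print(f"beta* = {beta}: mean relative difference = {delta:.1f}%")
```

Under this assumed definition the recomputed values match the tabulated Δ to within rounding, and the mean discrepancy shrinks as β* increases.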

C. Entropy and free energy calculation for the 
35-bead model B chain 

The entropies and free energies of 35-bead configura- 
tions are calculated in a similar way to the 25-bead case. 
Adding only 10 beads to the chain changes the number of 
possible attractive bonds from 8 in the 25-bead chain to 
23 in the 35-bead chain (cf. [2]), which results in a much 
more complex energy landscape. 

The dramatic change in landscape can be seen in Table VI
and Fig. 9, where we see that, unlike the 25-bead chain, the
probability of the most common structure does not approach
unity even at very low temperatures.

As can be seen in Table VI, by increasing β* (decreasing
temperature) a few structures become dominant at different
temperatures. Except for the lowest energy configuration with
23 attractive bonds, other energies are degenerate, with
multiple configurations possessing the same number of bonds.
It will be shown in the next section that a structure with 23
attractive bonds is geometrically prohibited. In fact,
configurations with 21, 22, or 23 attractive bonds have not
been observed in any simulation runs.

One difference between the landscape of the 35-bead chain
and that of the 25-bead chain is the magnitude of the entropic
barriers between configurations with different energies. The
most common structures in Table VI at high β* have an energy
of −19ε. Besides the two main configurations with the energy
of −19ε, which are presented in Table VI, there are at least
18 other configurations with the same potential energy but
with lower entropies. Three structures with an energy of −20ε
and with relatively low entropies have been observed in the
runs as well, but, according to Table VI, these were never
among the first two most common structures. The configuration
with the potential energy of −20ε that has the highest
entropy differs in five bonds from the most common
configuration of Table VI. This implies that there is a
substantial entropic barrier between these configurations.

A second difference with the 25-bead case is that a few
different configurations of the 35-bead chain exist at low
temperatures and are observed with nearly the same frequency.
For example, as Table VI shows, the two most common
structures for 16.5 < β* < 53 have 19 attractive bonds. While
these two structures differ slightly in their populations,
structurally they differ by more than one bond, quite unlike
the seven most common structures of the 25-bead chain at
β* = 6 (cf. Table III), which only differ from each other by
one bond. Since the most common structures of the 35-bead
chain at low temperatures share most of their bonds, they are
near one another in the energy landscape. However, since the
most common structure differs from other common structures by
more than one bond, they do not necessarily lie inside a
single valley in the landscape. A more plausible
interpretation is that the landscape at low temperatures for
the 35-bead chain consists of several minima that are close
but not necessarily inside the same channel, and that the
landscape does not have a single deep minimum at very low
temperatures.
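
The structural comparison made here amounts to a set difference between bond lists. The sketch below applies it to the structures of Table III and to the two most common 35-bead structures at β* = 16.5 in Table VI, as transcribed from those tables; the helper names are hypothetical.

```python
def bonds(label):
    """Parse a whitespace-separated bond list such as 'BF BR BV' into a set."""
    return frozenset(label.split())


def bond_difference(a, b):
    """Return (bonds only in a, bonds only in b, bonds shared by both)."""
    return a - b, b - a, a & b


# 25-bead chain, Table III (beta* = 6): rank 1 and the six runners-up.
top_25 = bonds("BF BR BV FJ FV JN NR RV")
others_25 = [
    bonds(s) for s in (
        "BF BR BV FJ JN NR RV",
        "BF BV FJ FV JN NR RV",
        "BF BR BV FV JN NR RV",
        "BF BR BV FJ FV JN RV",
        "BF BR BV FJ FV NR RV",
        "BF BR BV FJ FV JN NR",
    )
]
for other in others_25:
    missing, extra, _ = bond_difference(top_25, other)
    print(f"25-bead runner-up lacks {sorted(missing)}, adds {sorted(extra)}")

# 35-bead chain, Table VI: the two most common structures at beta* = 16.5.
first_35 = bonds("BF BR BV BZ Bh FJ FZ Fd Fh JN Jd Jh NR Nh RV Rh VZ Zd dh")
second_35 = bonds("BF BZ Bd Bh FJ FV FZ Fd Fh JN Jd Jh NR Nd Nh RV VZ Zd dh")
only_first, only_second, shared = bond_difference(first_35, second_35)
print(f"35-bead: {len(shared)} shared bonds, "
      f"{sorted(only_first)} versus {sorted(only_second)}")
```

Each 25-bead runner-up is the folded helix with a single bond removed, whereas converting one 35-bead structure into the other requires exchanging several bonds while keeping the total at 19.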

A final difference with the 25-bead case that becomes
apparent is that the range of energies and that of entropies
for the observed configurations are 8ε and 32 k_B,
respectively, for 25-bead chains, while these are 20ε and
140 k_B, respectively, for 35-bead chains. This confirms the
view that the landscape of the 35-bead chain is much wider
than the 25-bead chain landscape. This also shows that
studying the landscape requires a much wider range of
temperatures and more replicas.

FIG. 8: Variation of the probabilities of the most common
structure, f*, versus β* for chains with 15, 20, 25 and 29
beads.

FIG. 9: Variation of the probabilities of the most common
structure, f*, versus β* for the chains with 29, 30 and 35
beads. The result of the 29-bead chain from Fig. 8 is
presented here as a reference.



β*      Most common structure                                       f_obs (%)
1.5     No Bond                                                     11.7 ± 1.3
5.25    BF BZ Bd Bh FJ FV FZ Fd Fh JN Jd Jh NR Nd RV VZ Zd dh        7.2 ± 0.9
9.0     BF BZ Bd Bh FJ FV FZ Fd Fh JN Jd Jh NR Nd RV VZ Zd dh       18.8 ± 1.5
16.5    BF BR BV BZ Bh FJ FZ Fd Fh JN Jd Jh NR Nh RV Rh VZ Zd dh    25.8 ± 1.8
31.5    BF BR BV BZ Bh FJ FZ Fd Fh JN Jd Jh NR Nh RV Rh VZ Zd dh    24.7 ± 1.6
53.63   BF BR BV BZ Bh FJ FZ Fd Fh JN Jd Jh NR Nh RV Rh VZ Zd dh    23.6 ± 1.6

β*      Second most common structure                                f_obs (%)
1.5     dh                                                           2.3 ± 0.6
5.25    BF BR BV BZ Bd Bh FJ Fd Fh JN Jh NR Nh RV Rh VZ Zd dh        5.3 ± 0.9
9.0     BF BR BV BZ Bh FJ FZ Fd Fh JN Jd Jh NR Nh RV Rh VZ Zd dh    15.4 ± 1.5
16.5    BF BZ Bd Bh FJ FV FZ Fd Fh JN Jd Jh NR Nd Nh RV VZ Zd dh    14.4 ± 1.4
31.5    BF BZ Bd Bh FJ FV FZ Fd Fh JN Jd Jh NR Nd Nh RV VZ Zd dh    16.7 ± 1.4
53.63   BF BZ Bd Bh FJ FV FZ Fd Fh JN Jd Jh NR Nd Nh RV VZ Zd dh    15.2 ± 1.3

TABLE VI: Most common configurations of the model B 35-bead chain.

D. Effects of the protein-like chain length

For 25-bead chains, the probability of the most common
structure approaches unity at low temperatures, while the
longer 35-bead chain did not show this trend. There are two
possible reasons for this behavior. First, it is possible that
the studied range of temperatures was not sufficiently large
to observe the lowest energy configuration for long chains in
the simulation. The second possible reason is that the lowest
possible energy is not geometrically accessible considering
the criteria of model B. The effect of the inaccessibility of
the lowest energy configuration is that several structures
with the same energy compete for the highest probability.
While the configurational entropies of these structures are
somewhat different, there is no configuration with a much
higher entropy than all the other structures with the same
energy, and hence none of their maximum structural
probabilities approaches unity in the accessible temperature
range. It turns out that the second scenario is much more
plausible. To understand why, it is helpful to consider the
thermodynamic characteristics of model B for other chain
lengths. For chains of length 15, 20, 25, 29, 30 and 35, the
maximum number of attractive bonds is 3, 5, 8, 12, 17 and 23,
respectively. The temperature dependence of the probability of
the most common structure for these cases is shown in
Figs. 8 and 9.

One sees in Fig. 8 that for chains with 15, 20, 25 and 29
beads, after going through one or two minima, the probability
of the most common structure f* approaches unity at low
temperatures. In these cases, the most probable configurations
are also the ones with the lowest energy, i.e., with the
maximum number of attractive bonds. For the 29-bead chain
there is a distinctive peak in the probability of the most
common structure at β* ≈ 7.5, which can be explained by the
large entropy difference between the most common structure
with 11 bonds and the most common structure with 12 bonds,
which allows the 11-bond configuration to become the most
common structure for 4.35 < β* < 9. Apparently, at β* = 9,
the energy difference becomes equal to the entropy difference
times T*, so that for β* > 9 the structure with 12 bonds
becomes the most common structure.
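
The crossover condition stated above can be made quantitative. The following is a sketch that assumes each attractive bond contributes −ε to the potential energy (as in Table IV, where the energy equals minus the number of bonds) and that β* = ε/(k_B T); the subscripts 11 and 12 label the most common 11-bond and 12-bond structures and are notation introduced here.

```latex
F_{11} = F_{12}
\;\Longleftrightarrow\;
U_{11} - U_{12} = T\,(S_{11} - S_{12})
\;\Longrightarrow\;
\frac{S_{11} - S_{12}}{k_B}
  = \beta^{*}\,\frac{U_{11} - U_{12}}{\epsilon}
  \approx 9 \times 1 = 9 .
```

Under these assumptions the 11-bond structure has roughly 9 k_B more configurational entropy than the 12-bond structure, which is large compared with the typical loss per bond seen for the 25-bead chain.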

Fig. 9 shows that the situation is quite different for
longer chains. For the 30-bead chain, the maximum possible
number of bonds is 17, but no such structure was observed in
the simulations, even when using different numbers of
replicas, different PT temperature sets and different ranges
of temperatures. This strongly suggests that it is impossible
to satisfy the geometric constraints needed to form all
possible bonds. Once the geometric constraints cannot all be
satisfied for one particular chain length, this automatically
implies that they also cannot be satisfied for longer chains.
Indeed, in the 35-bead case, the lowest energy configuration
is also not observed.



As can be seen in Fig. 9, when 4.5 < β* < 7.5, the
probability of the most common structure increases for the
30-bead chain (similar to the behavior observed in the 15-,
20-, 25- and 29-bead chain systems). The probability of the
most common structure then remains more or less unchanged up
to β* ≈ 15. After this plateau region, the probability
decreases until reaching a β* value at which the probabilities
of the two most common structures become equal (in this case,
these are the 15-bond structure with the highest entropy and
the 16-bond structure with the highest entropy), which can be
seen as a minimum in the graph. After passing this local
minimum, the structure with 16 bonds becomes the most common
structure. However, because there are at least six structures
with 16 bonds, the probability of the most common structure is
not close to one even at very low temperatures. One
explanation is that the structure with 17 bonds is
geometrically prohibited, so that several energetically
degenerate configurations with 16 bonds become common at low
temperatures (their relative populations depending on
configurational entropy differences).
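
The effect of an energetically degenerate set of lowest accessible configurations can be illustrated with a small calculation. The sketch assumes populations proportional to exp(S_c/k_B − β* U_c/ε) and that all higher-energy configurations are negligible at low temperature, so that the Boltzmann factors cancel; the entropy values are hypothetical and chosen only for illustration.

```python
import math


def low_temperature_populations(entropies):
    """Populations of energetically degenerate configurations as beta* -> infinity.

    With equal potential energy the Boltzmann factors cancel, so
    f_c = exp(S_c/k_B) / sum over c' of exp(S_c'/k_B).
    """
    s_max = max(entropies)
    weights = [math.exp(s - s_max) for s in entropies]  # shifted for stability
    total = sum(weights)
    return [w / total for w in weights]


# HYPOTHETICAL relative entropies (in k_B) for six degenerate 16-bond
# structures; these are not values from the simulations.
entropies = [1.0, 0.6, 0.3, 0.0, -0.2, -0.5]
populations = low_temperature_populations(entropies)
print(f"largest low-temperature population: {max(populations):.2f}")
```

Unless one structure has a much larger entropy than the rest, the largest population stays well below one however low the temperature. Applied in reverse to Table VI, the populations of the two 19-bond structures at β* = 31.5 (24.7% and 16.7%) would correspond to an entropy difference of only about 0.4 k_B under the same assumption.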

A second argument for the geometric frustration of the 
lowest energy configuration for larger chain lengths can 
be found by slowly relaxing the geometric restrictions im- 
posed by the range of interaction of the attractive bonds 
in the model. If the configuration is geometrically pro- 
hibited, then by slowly increasing the bonding distance 
one should find a critical value of the range at which the 
configuration suddenly becomes accessible, and since its 
energy is lower than any other structure, that configu- 
ration should at the same time suddenly become a very 
common, if not the most common, structure. 

The attractive bonds can be formed at a range
4.6 Å < r_ij < 5.8 Å (σ_1 = 4.6 Å and σ_2 = 5.8 Å), where r_ij
is the distance between beads i and j. To change the
attractive range, only σ_2 was increased. At σ_2 = 6.2 Å, it
was possible to observe the lowest energy configuration for
the 30-bead chain (with 17 attractive bonds) at low
temperatures, while this structure was not observed for the
runs with σ_2 < 6.1 Å. Figure 10 illustrates this by plotting
the probability of the most common structure as a function of
temperature for several values of σ_2. For σ_2 = 6.4 Å, the
probability of the 17-bond structure approaches one around
β* = 27, and on increasing the value of σ_2, this occurs at
lower β*, since the entropic barriers between the low-energy
configurations, such as the configurations with 15 bonds and
16 bonds, become smaller. The first bump in Fig. 10 represents
a temperature region where the structure with 15 bonds becomes
the most probable configuration, and the second bump occurs at
higher β* values, where a 16-bond configuration becomes the
most probable structure. Since the entropy difference between
the configurations with different energies becomes smaller for
larger σ_2, the range of β* over which the structure with 15
bonds is the most common structure becomes smaller for larger
σ_2 values, as can be seen for σ_2 = 6.7 Å and σ_2 = 6.9 Å in
Fig. 10.

FIG. 10: Variation of the probabilities of the most common
structure versus β* for the 30-bead chain for different
attractive bond interaction distances (increasing σ_2 from the
initial 5.8 Å to 6.4 Å, 6.7 Å and 6.9 Å).

We conclude that for chains shorter than 30 beads the
landscape consists of one deep funnel at low temperatures that
contains several minima. The funnel becomes steeper as the
temperature decreases. At very low temperatures the landscape
consists of a smooth funnel with a very deep global minimum
representing the configuration with the maximum number of
bonds. For chains longer than 29 beads, however, the landscape
does not consist of one deep funnel, even at low temperatures.
Rather, it consists of several minima or channels between
which there are entropic barriers that increase with
increasing chain length.



IV. CONCLUSIONS 

In this work two different models of a protein-like chain
that differ primarily in the number of attractive interactions
were introduced and the characteristics of their free energy
landscapes were analyzed. Fewer bonding interactions are
present in the second model (model B), leading to a system
with less frustration and a free energy landscape that
possesses fewer local minima. The models were designed to
encourage the formation of helical secondary structural
elements, and such helices were observed in model B at low
temperatures. For long enough chains (> 17 beads), model B
also allows a tertiary structure.

It was shown that for model B, the free energy land- 
scape of the 25-bead chain has a smooth funnel that has 
important effects on both the dynamics and the thermo- 
dynamics of the system. In this model, the free energy 
landscape at low temperatures contains a deep valley 
with several minima around it located inside one basin. 
As the temperature decreases, the deepest point of the 
funnel becomes deeper, while the minima around the 
deepest point become shallower. This trend continues
until a temperature is reached at which all local minima
in the free energy landscape have vanished and only
a single global minimum exists. In contrast to model B,
model A does not exhibit a preference for a specific native
structure at low temperatures. This may be attributed to 
several factors, such as the lack of rigidity of the chain in 
this model, several large entropic barriers, and the pos- 
sibility of having many structures with the same energy. 

It was shown that the relative configurational entropy is
temperature independent. Hence, using the populations of the
configurations at different temperatures, the relative free
energy and entropy of any pair of configurations can be
calculated. From the free energies of different structures at
the studied temperatures, the populations of all
configurations at any temperature were predicted and verified
against simulation results. The predictions agree reasonably
well with the simulation results, which shows one of the great
advantages of using discontinuous potentials to study the free
energy landscape.

In model B, the single funnel morphology of the free 
energy landscape persists for chains up to 29 beads long. 
However, for chains of 30 beads or longer, the simula- 
tion results strongly suggest that the structure satisfying 
all possible attractive bonds is geometrically prohibited, 
while at the same time, the entropic barriers between the 
configurations with different energies become larger. For 
long chains, the landscape at low temperatures consists 
of a few distinct channels that are relatively close to each 
other but separated by high barriers. 

The observed landscape can provide insight into the shape
of the landscape of actual proteins. While for small chains
the native structure seems to be the lowest free energy
structure, the existence of several distinct funnels in the
landscape of long chains suggests the possibility that the
native structure of real proteins is not necessarily the
lowest free energy structure but may correspond to a
configurational basin that can be accessed easily during the
folding dynamics. Another factor that should be considered for
long proteins is the important effect of temperature on the
morphology of the landscape. In our study, the basin
containing the global minimum becomes steeper as the
temperature decreases for short chains. However, for longer
chains, the basin becomes steeper while the deepest point of
the landscape can shift from one configuration to another
configuration with slightly different bonds over the same
temperature range. Thus, for long proteins, the structure may
be more sensitive to temperature fluctuations, and by slightly
changing the temperature the thermodynamically stable
configuration can shift to a configuration that differs
substantially.

The simulation results presented here can be used to
analyze the dynamics of the protein-like chain by computing
the first-passage time solution for the transition rates among
the individual microstates. The individual rates between
microstates can then be incorporated into a Markovian model of
the relaxation of the chain, and the dynamics of the folding
process can be examined to probe how features in the energy
landscape determine the relaxation profile of the protein-like
chain.